This will exclude biomass, exchange and demand reactions as they are unbalanced by definition. It will also fail all reactions where at least one metabolite does not have a formula defined. In steady state, for each metabolite the sum of influx equals the sum of efflux. Hence the net masses of both sides of any model reaction have to be equal. Reactions where at least one metabolite does not have a formula are not considered to be balanced, even though the remaining metabolites participating in the reaction might be. Implementation: For each reaction that isn't a boundary or biomass reaction check if each metabolite has a non-zero elements attribute and if so calculate if the overall element balance of reactants and products is equal to zero.
A total of 37 (4.50%) reactions are mass unbalanced with at least one of the metabolites not having a formula or the overall mass not equal to 0: GLYTRS, GLUTRS, e_Protein, e_DNA, GLUTRS_Gln, ...
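The mass balance check described above can be illustrated with a minimal sketch in plain Python. It is not memote's implementation: the helper names, the dictionary-based reaction representation, and the simple formula parser (no brackets or isotopes) are assumptions made for illustration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_mass_balanced(stoichiometry, formulas):
    """Return True if every element sums to zero over the reaction.

    stoichiometry: metabolite id -> coefficient (negative for reactants).
    formulas: metabolite id -> formula string (missing or empty = undefined).
    """
    total = Counter()
    for met, coefficient in stoichiometry.items():
        formula = formulas.get(met)
        if not formula:
            return False  # a missing formula counts as unbalanced
        for element, count in parse_formula(formula).items():
            total[element] += coefficient * count
    return all(abs(value) < 1e-9 for value in total.values())


# Hypothetical example: hexokinase, glc + atp -> g6p + adp + h
formulas = {
    "glc": "C6H12O6", "atp": "C10H12N5O13P3",
    "g6p": "C6H11O9P", "adp": "C10H12N5O10P2", "h": "H",
}
hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing = {"glc": -1, "unknown": 1}

print(is_mass_balanced(hexokinase, formulas))
print(is_mass_balanced(missing, formulas))
```

The second reaction fails only because one metabolite has no formula, which mirrors the rule that such reactions are never counted as balanced.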
This will exclude biomass, exchange and demand reactions as they are unbalanced by definition. It will also fail all reactions where at least one metabolite does not have a charge defined. In steady state, for each metabolite the sum of influx equals the sum of efflux. Hence the net charges of both sides of any model reaction have to be equal. Reactions where at least one metabolite does not have a charge are not considered to be balanced, even though the remaining metabolites participating in the reaction might be. Implementation: For each reaction that isn't a boundary or biomass reaction check if each metabolite has a non-zero charge attribute and if so calculate if the overall sum of charges of reactants and products is equal to zero.
A total of 0 (0.00%) reactions are charge unbalanced with at least one of the metabolites not having a charge or the overall charge not equal to 0:
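A comparable sketch for the charge balance check, again in plain Python with a hypothetical dictionary representation. It assumes that an undefined charge is stored as `None`, so that a legitimate charge of 0 is not mistaken for a missing value.

```python
def is_charge_balanced(stoichiometry, charges):
    """Return True if the coefficient-weighted charges sum to zero.

    stoichiometry: metabolite id -> coefficient (negative for reactants).
    charges: metabolite id -> integer charge, or None if undefined.
    """
    total = 0
    for met, coefficient in stoichiometry.items():
        charge = charges.get(met)
        if charge is None:
            return False  # an undefined charge counts as unbalanced
        total += coefficient * charge
    return total == 0


# Hypothetical example: hexokinase, glc + atp -> g6p + adp + h
charges = {"glc": 0, "atp": -4, "g6p": -2, "adp": -3, "h": 1, "x": None}
hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}

print(is_charge_balanced(hexokinase, charges))
print(is_charge_balanced({"glc": -1, "x": 1}, charges))
```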
A large fraction of model reactions able to carry unlimited flux under default conditions indicates problems with reaction directionality, missing cofactors, incorrectly defined transport reactions and more. Implementation: Without changing the default constraints run flux variability analysis. From the FVA results identify those reactions that carry flux equal to the model's maximal or minimal flux.
A fraction of 17.53% of the non-blocked reactions (in total 64 reactions) can carry unbounded flux in the default model condition. Unbounded reactions may be involved in thermodynamically infeasible cycles: AHSERL2, MTHFCx, R03239, R03238, R03237, ...
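Once FVA results are available, the classification step described above reduces to a comparison against the model's extreme bounds. The sketch below assumes the FVA output has already been collected into a dictionary and that the default bound is 1000; both are illustrative assumptions, not memote's internals.

```python
def find_unbounded(fva, max_bound=1000.0, tolerance=1e-9):
    """Return (unbounded reaction ids, fraction of non-blocked reactions).

    fva: reaction id -> (minimum flux, maximum flux) from FVA.
    """
    # Reactions whose flux range is not pinned to zero are non-blocked.
    non_blocked = {
        rxn for rxn, (low, high) in fva.items()
        if abs(low) > tolerance or abs(high) > tolerance
    }
    # A reaction is unbounded if it reaches the extreme bound either way.
    unbounded = sorted(
        rxn for rxn in non_blocked
        if fva[rxn][0] <= -max_bound or fva[rxn][1] >= max_bound
    )
    fraction = len(unbounded) / len(non_blocked) if non_blocked else 0.0
    return unbounded, fraction


# Hypothetical FVA results
fva = {
    "R1": (0.0, 1000.0),
    "R2": (-1000.0, 1000.0),
    "R3": (0.0, 12.5),
    "R4": (0.0, 0.0),  # blocked, excluded from the denominator
}
unbounded, fraction = find_unbounded(fva)
print(unbounded, fraction)
```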
The Sub Total is the result of the following calculation. For more information please click on "Readme" in the top left of the report.
The Systems Biology Ontology (SBO) allows researchers to annotate a model with terms which indicate the intended function of its individual components. The available terms are controlled and relational and can be viewed here http://www.ebi.ac.uk/sbo/main/tree. Implementation: Check if each cobra.Metabolite has a non-zero "annotation" attribute that contains the key "sbo".
A total of 794 metabolites (100.00%) lack annotation with any type of SBO term: k_e, xan_e, ptrc_e, sucr_e, btn_e, ...
The Total Score is the result of the following calculation. For more information please click on "Readme" in the top left of the report.
If a reaction is neither a transport reaction, a biomass reaction nor a boundary reaction, it is counted as a purely metabolic reaction. This test requires metabolite formulas to be present in order to identify transport reactions. It simply reports the number of purely metabolic reactions that have fixed constraints and does not have any mandatory 'pass' criteria. Implementation: From the pool of purely metabolic reactions identify reactions which are constrained to values other than the model's minimal or maximal possible bounds.
A total of 2 (0.33%) purely metabolic reactions have fixed constraints in the model, this excludes transporters, exchanges, or pseudo-reactions: 0.0 <= GALKr <= 0.0, 1.0 <= ATPM <= 1.0
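One way to read "constrained to values other than the model's minimal or maximal possible bounds" is sketched below. The rule used here is an assumption for illustration: a reaction is treated as unconstrained only if its bounds are the fully reversible, forward-only, or backward-only defaults. memote's actual criterion may differ.

```python
def find_fixed_constraints(bounds):
    """Return the reactions whose bounds deviate from the default patterns.

    bounds: reaction id -> (lower bound, upper bound).
    """
    model_min = min(lower for lower, _ in bounds.values())
    model_max = max(upper for _, upper in bounds.values())
    # Assumed default patterns: reversible, forward-only, backward-only.
    defaults = {
        (model_min, model_max),
        (0.0, model_max),
        (model_min, 0.0),
    }
    return {
        rxn: pair for rxn, pair in bounds.items()
        if tuple(pair) not in defaults
    }


# Hypothetical bounds; GALKr and ATPM mirror the report output above.
bounds = {
    "PGI": (-1000.0, 1000.0),
    "PFK": (0.0, 1000.0),
    "GALKr": (0.0, 0.0),
    "ATPM": (1.0, 1.0),
}
fixed = find_fixed_constraints(bounds)
for rxn, (lower, upper) in fixed.items():
    print(f"{lower} <= {rxn} <= {upper}")
```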
Cellular metabolism in any organism usually involves the transport of metabolites across a lipid bilayer. This test reports how many of these reactions, which transport metabolites from one compartment to another, are present in the model, as at least one transport reaction must be present for cells to take up nutrients and/or excrete waste. Implementation: A transport reaction is defined as follows: 1. It contains metabolites from at least 2 compartments and 2. at least 1 metabolite undergoes no chemical reaction, i.e., the formula and/or annotation stays the same on both sides of the equation. A notable exception is transport via PTS, which also contains the following restriction: 3. The transported metabolite(s) are transported into a compartment through the exchange of a phosphate. An example of transport via PTS would be pep(c) + glucose(e) -> glucose-6-phosphate(c) + pyr(c). Reactions similar to transport via PTS (referred to as "modified transport reactions") follow a similar pattern: A(x) + B-R(y) -> A-R(y) + B(y). Such modified transport reactions can be detected, but only when the formula is defined for all metabolites in a particular reaction. If this is not the case, transport reactions are identified through annotations, which cannot detect modified transport reactions.
A total of 218 (22.66%) transport reactions are defined in the model, this excludes purely metabolic reactions, exchanges, or pseudo-reactions: TO0001012, TO0000305, TO0000208, TZ3101375, TO1000084, ...
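Criteria 1 and 2 of the definition above can be sketched as follows. This simplified version compares formulas only and deliberately ignores PTS-style "modified transport reactions" and the annotation fallback; the function name and data layout are assumptions for illustration.

```python
def is_simple_transport(stoichiometry, compartments, formulas):
    """Check criteria 1 and 2 of the transport definition.

    stoichiometry: metabolite id -> coefficient (negative for reactants).
    compartments: metabolite id -> compartment id.
    formulas: metabolite id -> formula string (may be missing).
    """
    # Criterion 1: metabolites from at least two compartments.
    if len({compartments[met] for met in stoichiometry}) < 2:
        return False
    reactants = [m for m, coef in stoichiometry.items() if coef < 0]
    products = [m for m, coef in stoichiometry.items() if coef > 0]
    # Criterion 2: some formula is unchanged across the compartments.
    return any(
        formulas.get(r)
        and formulas.get(r) == formulas.get(p)
        and compartments[r] != compartments[p]
        for r in reactants
        for p in products
    )


# Hypothetical metabolites
compartments = {"glc_e": "e", "glc_c": "c", "g6p_c": "c"}
formulas = {"glc_e": "C6H12O6", "glc_c": "C6H12O6", "g6p_c": "C6H11O9P"}

uptake = {"glc_e": -1, "glc_c": 1}
conversion = {"glc_c": -1, "g6p_c": 1}
print(is_simple_transport(uptake, compartments, formulas))
print(is_simple_transport(conversion, compartments, formulas))
```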
Cellular metabolism in any organism usually involves the transport of metabolites across a lipid bilayer. Hence, this test reports how many of these reactions, which transport metabolites from one compartment to another, have fixed constraints. This test does not have any mandatory 'pass' criteria. Implementation: Please refer to "Transport Reactions" for details on how memote identifies transport reactions. From the pool of transport reactions identify reactions which are constrained to values other than the model's median lower and upper bounds.
A total of 6 (2.75%) transport reactions have fixed constraints in the model: 0.0 <= TZ2900102 <= 0.0, 0.0 <= TO0000971 <= 0.0, 0.0 <= TZ2900051 <= 0.0, 0.0 <= TO0010516 <= 0.0, -999999.0 <= TR0000239 <= 0.0, ...
Identify reactions in a pairwise manner that use identical sets of genes. It does *not* take into account a reaction's directionality, compartment, metabolites or annotations. The main reason for having this test is to help clean up merged models or models from automated reconstruction pipelines, as these are prone to having identical reactions with identifiers from different namespaces. Implementation: Compare reactions in a pairwise manner and group reactions whose genes are identical. Skip reactions with missing genes.
Based only on equal genes there are 134 different groups of identical reactions which corresponds to a total of 533 duplicated reactions in the model.
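The grouping step can be sketched without explicit pairwise comparison by keying each reaction on its gene set. The input format (reaction id mapped to a set of gene ids) is an assumption for illustration.

```python
from collections import defaultdict


def group_by_genes(reaction_genes):
    """Group reactions that share exactly the same set of genes.

    reaction_genes: reaction id -> set of gene ids (empty = no genes).
    Returns a list of groups, each containing at least two reactions.
    """
    groups = defaultdict(list)
    for rxn, genes in reaction_genes.items():
        if genes:  # skip reactions without genes
            groups[frozenset(genes)].append(rxn)
    return [sorted(rxns) for rxns in groups.values() if len(rxns) > 1]


# Hypothetical gene associations
reaction_genes = {
    "R1": {"g1", "g2"},
    "R2": {"g2", "g1"},
    "R3": {"g3"},
    "R4": set(),
    "R5": set(),
}
duplicates = group_by_genes(reaction_genes)
print(len(duplicates), sum(len(group) for group in duplicates))
```

The two printed numbers correspond to the group count and the total number of duplicated reactions reported above.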
Gene-Protein-Reaction rules express which gene has what function. The presence of this annotation is important to justify the existence of reactions in the model, and is required to conduct in silico gene deletion studies. However, reactions without GPR may also be valid: Spontaneous reactions, or known reactions with yet undiscovered genes likely lack GPR. Implementation: Check if each cobra.Reaction has a non-empty "gene_reaction_rule" attribute, which is set by the parser if there is an fbc:geneProductAssociation defined for the corresponding reaction in the SBML.
There are a total of 68 reactions (7.07%) without GPR: UAG2E, e_Cofactor, ATPM, AHAL, Fatty_Acid_Entity, ...
As it is hard to identify the exact transport processes within a cell, transport reactions are often added purely for modeling purposes. Highlighting where assumptions have been made versus where there is proof may help direct the efforts to improve transport and transport energetics of the tested metabolic model. However, transport reactions without GPR may also be valid: Diffusion, or known reactions with yet undiscovered genes likely lack GPR. Implementation: Check which cobra.Reactions classified as transport reactions have a non-empty "gene_reaction_rule" attribute.
There are a total of 17 transport reactions (7.80% of all transport reactions) without GPR: T_Glyceraldehyde, T_Ethanol_diff, T_Diacetyl_diff, T_Oxygen_diff, T_Xylose_abc, ...
Based on the gene-protein-reaction (GPR) rules, it is possible to infer whether a reaction is catalyzed by a single gene product, isozymes or by a heteromeric protein complex. This test checks that at least one such heteromeric protein complex is defined in any GPR of the model. For S. cerevisiae it could be shown that "essential proteins tend to [cluster] together in essential complexes" (https://doi.org/10.1074%2Fmcp.M800490-MCP200). This might also be a relevant metric for other organisms. Implementation: Identify GPRs which contain at least one logical AND that combines two different gene products.
A total of 61 reactions are catalyzed by complexes defined through GPR rules in the model.
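A minimal sketch of detecting a logical AND between two different gene products in a GPR rule. It parses the rule with Python's `ast` module, which only works when gene identifiers are valid Python identifiers; that restriction, and the function name, are assumptions of this example.

```python
import ast


def has_complex(gpr):
    """Return True if the rule ANDs at least two different genes."""
    gpr = gpr.strip()
    if not gpr:
        return False
    tree = ast.parse(gpr, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.BoolOp) and isinstance(node.op, ast.And):
            # Collect all gene names below this AND node.
            genes = {
                child.id for child in ast.walk(node)
                if isinstance(child, ast.Name)
            }
            if len(genes) >= 2:
                return True
    return False


# Hypothetical GPR rules
rules = {
    "R1": "b0001 and b0002",
    "R2": "b0001 or b0002",
    "R3": "(b0001 and b0002) or b0003",
    "R4": "",
}
complexes = sorted(rxn for rxn, rule in rules.items() if has_complex(rule))
print(complexes)
```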
This test only yields sensible results if all biomass precursor metabolites have chemical formulas assigned to them. The molecular weight of the biomass reaction in metabolic models is defined to be equal to 1 g/mmol. Conforming to this is essential in order to be able to reliably calculate growth yields, to cross-compare models, and to obtain valid predictions when simulating microbial consortia. A deviation from 1 - 1E-03 to 1 + 1E-06 is accepted. Implementation: Multiplies the coefficient of each metabolite of the biomass reaction with its molecular weight calculated from the formula, then divides the overall sum of all the products by 1000.
The component molar mass of the biomass reaction e_Biomass sums up to 5.684341886080802e-16 which is outside of the 1e-03 margin from 1 mmol / g[CDW] / h.
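The calculation described above can be sketched as follows. The precursor names and molar masses are hypothetical; only the arithmetic (coefficient times molecular weight, summed and divided by 1000) and the accepted interval are taken from the description.

```python
def biomass_weight(coefficients, molar_masses):
    """Sum coefficient * molar mass over the biomass reaction, in g.

    coefficients: metabolite id -> coefficient in mmol (negative = consumed).
    molar_masses: metabolite id -> molar mass in g/mol.
    """
    total_mg = sum(
        -coef * molar_masses[met] for met, coef in coefficients.items()
    )
    return total_mg / 1000.0


def is_consistent(weight):
    """Accept values from 1 - 1E-03 to 1 + 1E-06, as stated above."""
    return (1 - 1e-03) < weight < (1 + 1e-06)


# Hypothetical precursors
coefficients = {"precursor_a": -5.0, "precursor_b": -2.5}
molar_masses = {"precursor_a": 100.0, "precursor_b": 200.0}

weight = biomass_weight(coefficients, molar_masses)
print(weight, is_consistent(weight))
print(is_consistent(5.684341886080802e-16))  # value reported for e_Biomass
```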
There are universal components of life that make up the biomass of all known organisms. These include all proteinogenic amino acids, deoxy- and ribonucleotides, water and a range of metabolic cofactors. This test reports the number of biomass precursors that have been reported to be essential constituents of the biomass equation. All of the following precursors need to be included in the biomass reaction to pass the test: Amino acids: trp__L, cys__L, his__L, tyr__L, met__L, phe__L, ser__L, pro__L, asp__L, thr__L, gln__L, glu__L, ile__L, arg__L, lys__L, val__L, leu__L, ala__L, gly, asn__L DNA: datp, dctp, dttp, dgtp RNA: atp, ctp, utp, gtp Cofactors: nad, nadp, amet, fad, pydx5p, coa, thmpp, fmn and h2o. These metabolites were selected based on the results presented by DOI:10.1016/j.ymben.2016.12.002. Please note that the authors also suggest counting C1 carriers (derivatives of tetrahydrofolate (B9) or tetrahydromethanopterin) as universal cofactors. We have omitted these from this check because there are many individual compounds that classify as C1 carriers, and it is not clear a priori which one should be preferred. In a future update, we may consider identifying these using a chemical ontology. Implementation: Determine whether the model employs a lumped or split biomass reaction. Then, using an internal mapping table, try to identify the above list of essential precursors in the list of precursor metabolites of either type of biomass reaction. List IDs in the model's namespace if the metabolite exists, else use the MetaNetX namespace if the metabolite does not exist in the model. Identifies the cytosol from an internal mapping table, and assumes that all precursors exist in that compartment.
e_Biomass lacks a total of 20 essential metabolites (200.00% of all biomass precursors). Specifically these are: ['trp__L_c', 'cys__L_c', 'his__L_c', 'tyr__L_c', 'met__L_c', 'phe__L_c', 'ser__L_c', 'pro__L_c', 'asp__L_c', 'thr__L_c', 'gln__L_c', 'glu__L_c', 'ile__L_c', 'arg__L_c', 'val__L_c', 'leu__L_c', 'gly_c', 'asn__L_c', 'pydx5p_c', 'thmpp_c'].
The Non-Growth Associated Maintenance reaction (NGAM) is an ATP-hydrolysis reaction added to metabolic models to represent energy expenses that the cell invests in continuous processes independent of the growth rate. Memote tries to infer this reaction from a list of buzzwords, and the stoichiometry and components of a simple ATP-hydrolysis reaction. Implementation: From the list of all reactions that convert ATP to ADP select the reactions that match the irreversible reaction "ATP + H2O -> ADP + HO4P + H+", whose metabolites are situated within the main model compartment. The main model compartment is assumed to be the cytosol, yet, if that cannot be identified, it is assumed to be the compartment with the most metabolites. The resulting list of reactions is then filtered further by attempting to match the reaction name with any of the following buzzwords ('maintenance', 'atpm', 'requirement', 'ngam', 'non-growth', 'associated'). If this is possible only the filtered reactions are returned, if not the list is returned as is.
A total of 1 NGAM reactions could be identified: ATPM
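The final filtering step, matching reaction names against the buzzword list, can be sketched as below. The buzzwords are those quoted above; the candidate dictionary and the case-insensitive substring match are assumptions for illustration.

```python
BUZZWORDS = (
    "maintenance", "atpm", "requirement", "ngam", "non-growth", "associated",
)


def filter_by_buzzwords(candidates):
    """Keep candidates whose name matches a buzzword, else keep them all.

    candidates: reaction id -> reaction name, already restricted to
    reactions matching the ATP hydrolysis stoichiometry.
    """
    matched = [
        rxn for rxn, name in candidates.items()
        if any(word in name.lower() for word in BUZZWORDS)
    ]
    # If nothing matches, the unfiltered list is returned as is.
    return matched if matched else list(candidates)


# Hypothetical candidates
candidates = {
    "ATPM": "ATP maintenance requirement",
    "ATPH": "ATP phosphohydrolase",
}
print(filter_by_buzzwords(candidates))
print(filter_by_buzzwords({"ATPH": "ATP phosphohydrolase"}))
```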
The growth-associated maintenance (GAM) term accounts for the energy in the form of ATP that is required to synthesize macromolecules such as Proteins, DNA and RNA, and other processes during growth. A GAM term is therefore a requirement for any well-defined biomass reaction. There are different ways to implement this term depending on what kind of experimental data is available and the preferred way of implementing the biomass reaction: - Chemostat growth experiments yield a single GAM value representing the required energy per gram of biomass (Figure 6 of [1]_). This can be implemented in a lumped biomass reaction or in the final term of a split biomass reaction. - Experimentally delineating or estimating the GAM requirements for each macromolecule separately is possible, yet requires either data from multi-omics experiments [2]_ or detailed resources [1]_ , respectively. Individual energy requirements can either be implemented in a split biomass equation on the term for each macromolecule, or, on the basis of the biomass composition, they can be summed into a single GAM value for growth and treated as mentioned above. This test is only able to detect if a lumped biomass reaction or the final term of a split biomass reaction contains this term. Hence, it will only detect the use of a single GAM value as opposed to individual energy requirements of each macromolecule. Both approaches, however, have their merits. Implementation: Determines the metabolite identifiers of ATP, ADP, H2O, HO4P and H+ based on an internal mapping table. Checks if ATP and H2O are a subset of the reactants and ADP, HO4P and H+ a subset of the products of the biomass reaction. References: .. [1] Thiele, I., & Palsson, B. Ø. (2010, January). A protocol for generating a high-quality genome-scale metabolic reconstruction. Nature protocols. Nature Publishing Group. http://doi.org/10.1038/nprot.2009.203 .. [2] Hackett, S. R., Zanotelli, V. R. T., Xu, W., Goya, J., Park, J. O., Perlman, D. H., Gibney, P. A., Botstein, D., Storey, J. D., Rabinowitz, J. D. (2016, October). Systems-level analysis of mechanisms regulating yeast metabolic flux. Science. http://doi.org/10.1126/science.aaf2786
Yes, e_Biomass contains a term for growth-associated maintenance.
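The subset check described in the implementation note can be sketched with plain sets. The identifier mapping and the toy biomass reaction are hypothetical; memote resolves the identifiers through its internal mapping table.

```python
def contains_gam(stoichiometry, ids):
    """Check for ATP + H2O on the left and ADP + HO4P + H+ on the right.

    stoichiometry: metabolite id -> coefficient (negative for reactants).
    ids: mapping of the keys 'atp', 'h2o', 'adp', 'pi', 'h' to model ids.
    """
    reactants = {met for met, coef in stoichiometry.items() if coef < 0}
    products = {met for met, coef in stoichiometry.items() if coef > 0}
    needed_left = {ids["atp"], ids["h2o"]}
    needed_right = {ids["adp"], ids["pi"], ids["h"]}
    return needed_left <= reactants and needed_right <= products


# Hypothetical identifiers and biomass reactions
ids = {"atp": "atp_c", "h2o": "h2o_c", "adp": "adp_c", "pi": "pi_c", "h": "h_c"}
with_gam = {
    "protein_c": -0.5, "atp_c": -40.0, "h2o_c": -40.0,
    "adp_c": 40.0, "pi_c": 40.0, "h_c": 40.0, "biomass_c": 1.0,
}
without_gam = {"protein_c": -0.5, "atp_c": -0.1, "biomass_c": 1.0}

print(contains_gam(with_gam, ids))
print(contains_gam(without_gam, ids))
```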
When a model is not sufficiently constrained to account for the thermodynamics of reactions, flux cycles may form which provide reduced metabolites to the model without requiring nutrient uptake. These cycles are referred to as erroneous energy-generating cycles. Their effect on the predicted growth rate in FBA may account for an increase of up to 25%, which makes studies involving the growth rates predicted from such models unreliable. Implementation: This test uses an implementation of the algorithm presented by: Fritzemeier, C. J., Hartleb, D., Szappanos, B., Papp, B., & Lercher, M. J. (2017). Erroneous energy-generating cycles in published genome scale metabolic networks: Identification and removal. PLoS Computational Biology, 13(4), 1–14. http://doi.org/10.1371/journal.pcbi.1005494 First attempt to identify the main compartment (cytosol), then attempt to identify each metabolite of the referenced list of energy couples via an internal mapping table. Construct a dissipation reaction for each couple. Carry out FBA with each dissipation reaction as the objective and report those reactions that carry non-zero flux.
Universally blocked reactions are reactions that during Flux Variability Analysis cannot carry any flux while all model boundaries are open. Generally blocked reactions are caused by network gaps, which can be attributed to scope or knowledge gaps. Implementation: Use flux variability analysis (FVA) implemented in cobra.flux_analysis.find_blocked_reactions with open_exchanges=True. Please refer to the cobrapy documentation for more information: https://cobrapy.readthedocs.io/en/stable/autoapi/cobra/flux_analysis/variability/index.html#cobra.flux_analysis.variability.find_blocked_reactions
There are 224 (23.28%) blocked reactions in the model: AMID, R08233, R05590, AMID3, SGHA, ...
Dead-ends are metabolites that can only be produced but not consumed by reactions in the model. They may indicate the presence of network and knowledge gaps. Implementation: Find dead-end metabolites structurally by considering only reaction equations and reversibility. FBA is not carried out.
A total of 63 (7.93%) metabolites are not consumed by any reaction of the model: inost_c, aa_c, thcys_c, ind3ac_c, gdptp_c, ...
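A structural dead-end search that uses only reaction equations and reversibility, as described above, might look like the sketch below. The reaction representation is an assumption for illustration, and no FBA is involved.

```python
def find_dead_ends(reactions):
    """Return metabolites that can be produced but never consumed.

    reactions: reaction id -> (stoichiometry dict, reversible flag),
    where negative coefficients mark reactants.
    """
    produced, consumed = set(), set()
    for stoichiometry, reversible in reactions.values():
        for met, coef in stoichiometry.items():
            # A reversible reaction can both produce and consume.
            if coef > 0 or reversible:
                produced.add(met)
            if coef < 0 or reversible:
                consumed.add(met)
    return sorted(produced - consumed)


# Hypothetical linear pathway: a -> b <-> c -> d
reactions = {
    "R1": ({"a": -1, "b": 1}, False),
    "R2": ({"b": -1, "c": 1}, True),
    "R3": ({"c": -1, "d": 1}, False),
}
print(find_dead_ends(reactions))
```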
Stoichiometrically Balanced Cycles are artifacts of insufficiently constrained networks resulting in reactions that can carry flux when all the boundaries have been closed. Implementation: Close all model boundary reactions and then use flux variability analysis (FVA) to identify reactions that carry flux.
There are 80 (8.32%) reactions which participate in SBC in the model: AHSERL2, MTHFCx, R03239, R03238, R03237, ...
The in-silico growth prediction is compared with experimental data and the accuracy is expected to be better than 0.95. In principle, Matthews' correlation coefficient is a more comprehensive metric, but it is fragile when the output contains no false negatives or no false positives. Implementation: Read and validate experimental config file and data tables. Constrain the model with the parameters provided by a user's definition of the medium, then compute a confusion matrix based on the predicted true, expected true, predicted false and expected false growth. The individual values of the confusion matrix are calculated as described in https://en.wikipedia.org/wiki/Confusion_matrix
The in-silico gene essentiality is compared with experimental data and the accuracy is expected to be better than 0.95. In principle, Matthews' correlation coefficient is a more comprehensive metric, but it is fragile when the output contains no false negatives or no false positives. Implementation: Read and validate experimental config file and data tables. Constrain the model with the parameters provided by a user's definition of the medium, then compute a confusion matrix based on the predicted essential, expected essential, predicted nonessential and expected nonessential genes. The individual values of the confusion matrix are calculated as described in https://en.wikipedia.org/wiki/Confusion_matrix
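Both experimental comparisons rest on the same confusion matrix arithmetic, sketched here with hypothetical boolean predictions. The function names are illustrative. The example also shows why Matthews' correlation coefficient is fragile: its denominator becomes zero when a whole row or column of the matrix is empty.

```python
import math


def confusion_matrix(predicted, expected):
    """Count true/false positives and negatives from boolean sequences."""
    pairs = list(zip(predicted, expected))
    tp = sum(1 for p, e in pairs if p and e)
    fp = sum(1 for p, e in pairs if p and not e)
    tn = sum(1 for p, e in pairs if not p and not e)
    fn = sum(1 for p, e in pairs if not p and e)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def matthews(tp, fp, tn, fn):
    """Return the MCC, or None when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return None
    return (tp * tn - fp * fn) / denominator


# Hypothetical predictions versus experimental observations
predicted = [True, True, False, False, True]
expected = [True, False, False, False, True]
matrix = confusion_matrix(predicted, expected)
print(matrix, accuracy(*matrix), matthews(*matrix))

# With only positives, accuracy is perfect but the MCC is undefined.
all_true = confusion_matrix([True, True, True], [True, True, True])
print(accuracy(*all_true), matthews(*all_true))
```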